Auditing an existing site is one of the most important tasks that
SEO professionals encounter. SEO is a relatively new field, and many of
the limitations of search engine crawlers are non-intuitive. In addition,
web developers are generally not well versed in SEO. This includes those
who have developed CMSs, so there is a lot of opportunity to find problems
when conducting a site audit.
1. Elements of an Audit
Your website needs to
be a strong foundation for the rest of your SEO efforts to succeed. An
SEO site audit is often the first step in executing an SEO
strategy.
The following sections identify what you should look for when
performing a site audit.
1.1. Usability
Although this may not be seen as a direct SEO issue, it is a
very good place to start. Usability affects many factors, including
conversion rate, as well as the propensity of people to link to a
site.
1.2. Accessibility/spiderability
Make sure the site is friendly to search engine spiders.
1.3. Search engine health check
Here are some quick health checks:
Perform a site:yourdomain.com search in
the search engines to make sure all your pages appear to be in the
index. Compare this to the number of unique pages you believe you
have on your site.
Test a search on your brand terms to make sure you are
ranking for them (if not, you may be suffering from a
penalty).
Check the Google cache to make sure the cached versions of
your pages look the same as the live versions of your
pages.
1.4. Keyword health checks
Are the right keywords being targeted? Does the site
architecture logically flow from the way users search on related
keywords? Does more than one page target the same exact keyword
(a.k.a. keyword cannibalization)?
1.5. Duplicate content checks
The first thing you should do is to make sure the non-www
version of your pages (i.e.,
http://yourdomain.com) 301-redirects to the www
version of your pages (i.e.,
http://www.yourdomain.com), or vice versa (this
is often called the canonical redirect). While
you are at it, check that you don’t have https: pages that are
duplicates of your http: pages. You should check the rest of the
content on the site as well.
The easiest way to do this is to take unique strings from each
of the major content pages on the site and search on them in Google.
Make sure you enclose the string inside double quotes (e.g., “a phrase
from your website that you are using to check for duplicate content”)
so that Google will search for that exact string.
If your site is monstrously large and this is too big a task,
make sure you check the most important pages, and have a process for
reviewing new content before it goes live on the site.
You can also use commands such as inurl: and intitle: to check for duplicate content. For
example, if you have URLs for pages that have distinctive components
to them (e.g., “1968-mustang-blue” or “1097495”), you can search for
these with the inurl: command and
see whether they return more than one page.
Another duplicate content task to perform is to make sure each
piece of content is accessible at only one URL. This probably trips up
more big, commercial sites than any other issue. The issue is that the
same content is accessible in multiple ways and on multiple URLs,
forcing the search engines (and visitors) to choose which is the
canonical version, which to link to, and which to disregard. No one
wins when sites fight themselves—make peace, and if you have to
deliver the content in different ways, rely on cookies so that you
don’t confuse the spiders.
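Once you have crawled a site, one rough way to spot content that is served at more than one URL is to group pages by a hash of their text. The sketch below assumes you already have a mapping of URL to extracted page text; the `pages` dictionary and the function name are hypothetical. It only catches exact duplicates after whitespace and case normalization, not near-duplicates.

```python
import hashlib
from collections import defaultdict


def find_duplicate_urls(pages):
    """Group URLs whose text is identical after normalization.

    pages: dict mapping URL -> page text (assumed to be extracted already).
    Returns a list of URL lists, one per group of duplicates.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        # Collapse whitespace and ignore case so trivial differences
        # do not hide a duplicate.
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]
```

Each group in the result is a set of URLs that deserve a closer look: ideally only one of them should remain reachable, with the others redirected to it.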
1.6. URL check
Make sure you have clean, short, descriptive URLs.
Descriptive means keyword-rich but not
keyword-stuffed. You don’t want parameters appended (or have a minimal
number if you must have any), and you want them to be simple and easy
for users (and spiders) to understand.
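A first pass over a list of URLs can be automated. The following is a minimal sketch, not a definitive rule set: the length and parameter thresholds are assumptions chosen for illustration (the text gives no numbers), and the function name is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit

MAX_URL_LENGTH = 100  # assumed threshold, not from the text
MAX_PARAMETERS = 1    # assumed threshold, not from the text


def url_issues(url):
    """Return a list of warnings for one URL."""
    parts = urlsplit(url)
    issues = []
    if len(url) > MAX_URL_LENGTH:
        issues.append("long URL")
    # Count query-string parameters, including ones with empty values.
    params = parse_qsl(parts.query, keep_blank_values=True)
    if len(params) > MAX_PARAMETERS:
        issues.append("too many parameters")
    # A path segment made only of digits tells users and spiders nothing.
    segments = [s for s in parts.path.split("/") if s]
    if any(s.isdigit() for s in segments):
        issues.append("numeric path segment")
    return issues
```

A check like this cannot judge whether a URL is keyword-rich or keyword-stuffed; that still needs a human reviewer.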
1.7. Title tag review
Make sure the title tag on each page of the site is unique and
descriptive. Ideally, don’t waste your time (or limited space) by
including the brand name of your organization in the title tag. If you
must include it, the brand name should show up at the end of the title
tag, not at the beginning, as placement of keywords at the front of the
title tag brings ranking benefits. Also check to make sure the title tag
is fewer than 70 characters long.
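The uniqueness and length checks lend themselves to a script. The sketch below assumes you have already fetched each page and hold the HTML in a dictionary keyed by URL; the class and function names are hypothetical, and the parser is Python's standard `html.parser`, which is forgiving but not a full browser.

```python
from collections import defaultdict
from html.parser import HTMLParser

MAX_TITLE_LENGTH = 70  # titles should be fewer than 70 characters


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split())


def audit_titles(pages):
    """pages: dict of URL -> HTML. Returns a list of (url, problem) pairs."""
    problems = []
    seen = defaultdict(list)
    for url, html in pages.items():
        title = extract_title(html)
        if not title:
            problems.append((url, "missing title"))
            continue
        if len(title) >= MAX_TITLE_LENGTH:
            problems.append((url, "title too long"))
        # Compare case-insensitively so "Blue Widgets" matches "blue widgets".
        seen[title.lower()].append(url)
    for urls in seen.values():
        if len(urls) > 1:
            problems.extend((url, "duplicate title") for url in urls)
    return problems
```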
1.8. Content review
Do the main pages of the site have enough content? Do these
pages all make use of header tags? A subtler variation of this is
making sure the percentage of pages on the site with little content is
not too high compared to the total number of pages on the
site.
1.9. Meta tag review
Check for a meta robots tag
on the pages of the site. If you find one, you may have already
spotted trouble. An unintentional NoIndex or NoFollow tag could really mess up
your search ranking plans.
Also make sure every page has a unique meta description. If for
some reason that is not possible, consider removing the meta
description altogether. Although the meta description tags are not a
significant factor in ranking, they may well be used in duplicate
content calculations, and the search engines frequently use them as
the description for your web page in the SERPs; therefore, they affect
click-through rate.
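Both meta tag checks can be scripted in one pass. This is a minimal sketch that assumes the HTML for each page is already in a dictionary keyed by URL; `MetaParser` and `audit_meta` are hypothetical names. It reports any NoIndex or NoFollow directive it finds, plus pages whose meta description is missing or shared with another page.

```python
from collections import Counter
from html.parser import HTMLParser


class MetaParser(HTMLParser):
    """Collects meta robots directives and the meta description."""

    def __init__(self):
        super().__init__()
        self.robots = set()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").strip()
        if name == "robots":
            # Directives are comma-separated and case-insensitive.
            self.robots.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )
        elif name == "description" and self.description is None:
            self.description = content


def audit_meta(pages):
    """pages: dict of URL -> HTML. Returns dict of URL -> list of warnings."""
    parsed = {}
    for url, html in pages.items():
        parser = MetaParser()
        parser.feed(html)
        parser.close()
        parsed[url] = parser
    counts = Counter(p.description for p in parsed.values() if p.description)
    report = {}
    for url, p in parsed.items():
        warnings = sorted(p.robots & {"noindex", "nofollow"})
        if not p.description:
            warnings.append("missing description")
        elif counts[p.description] > 1:
            warnings.append("duplicate description")
        if warnings:
            report[url] = warnings
    return report
```

A reported directive is not automatically an error; the point is to confirm that each one is intentional.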
1.10. Sitemaps file and robots.txt file verification
Use the Google Webmaster Tools robots.txt verification tool to
check your robots.txt file. Also
verify that your Sitemaps file is identifying all of your (canonical)
pages.
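As a complement to the Webmaster Tools check, you can cross-check the two files against your own list of canonical URLs. The sketch below uses Python's standard `urllib.robotparser` and `xml.etree.ElementTree`; the function names, the `canonical_urls` list, and the default user agent are assumptions for illustration. Python's parser may not interpret wildcard patterns the way Google does, so treat its verdicts as a first pass only.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by the Sitemaps XML protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a Sitemaps XML document."""
    root = ET.fromstring(sitemap_xml)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def audit_sitemap(robots_txt, sitemap_xml, canonical_urls, user_agent="Googlebot"):
    """Compare a Sitemaps file with robots.txt and a list of canonical URLs."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    listed = sitemap_urls(sitemap_xml)
    return {
        # Canonical pages the Sitemaps file fails to mention.
        "missing_from_sitemap": sorted(set(canonical_urls) - listed),
        # Pages the Sitemaps file advertises but robots.txt blocks.
        "blocked_by_robots": sorted(
            u for u in listed if not robots.can_fetch(user_agent, u)
        ),
    }
```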
1.11. Redirect checks
Use a server header checker such as Live HTTP Headers to
check that all the redirects used on the site return a 301 HTTP status
code. Check all redirects this way to make sure the right thing is
happening. This includes checking that the canonical redirect is
properly implemented.
Unfortunately, given the non-intuitive nature of why the 301 is
preferred, you should verify that this has been done properly even if
you provided explicit direction to the web developer in advance.
Mistakes do get made, and sometimes the CMS or the hosting company
makes it difficult to use a 301.
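If you have many redirects to verify, a short script can do what a header checker does by hand. The sketch below is one possible approach using Python's standard `http.client`, which does not follow redirects on its own; the function names are hypothetical. It assumes the server answers HEAD requests and sends an absolute URL in the Location header, neither of which is guaranteed.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Request url without following redirects.

    Returns (status code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def classify_redirect(status, location, expected_target):
    """Describe whether a response is the 301 redirect you expected."""
    if status == 301:
        if location == expected_target:
            return "ok"
        return "301 to unexpected target"
    if status in (302, 303, 307):
        return "temporary redirect, expected 301"
    return "unexpected status {}".format(status)
```

To test the canonical redirect, for example, you would pass the result of `fetch_status("http://yourdomain.com/")` to `classify_redirect` with `"http://www.yourdomain.com/"` as the expected target.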
1.12. Internal linking checks
Look for pages that have excess links. Google advises 100 per
page as a maximum, although it is OK to go with more on more important
and heavily linked-to pages.
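Counting links by hand is tedious, so a rough count can be scripted. This sketch assumes you hold each page's HTML in a dictionary keyed by URL; the names are hypothetical, and it counts every anchor element with an `href`, internal or external.

```python
from html.parser import HTMLParser

LINK_LIMIT = 100  # the per-page guideline mentioned above


class LinkCounter(HTMLParser):
    """Counts <a> elements that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self.count += 1


def count_links(html):
    counter = LinkCounter()
    counter.feed(html)
    counter.close()
    return counter.count


def pages_with_excess_links(pages, limit=LINK_LIMIT):
    """pages: dict of URL -> HTML. Returns dict of URL -> link count over limit."""
    counts = {url: count_links(html) for url, html in pages.items()}
    return {url: n for url, n in counts.items() if n > limit}
```

Pages that show up in the result are candidates for review, not automatic failures, since important and heavily linked-to pages can carry more links.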
Make sure the site makes good use of anchor text on its internal
links. This is a free opportunity to inform users and search engines
what the various pages of your site are about. Don’t abuse it, though.
For example, if you have a link to your home page in your global
navigation (which you should), call it “Home” instead of picking your
juiciest keyword. The anchor text of internal links to the home page
is not helpful for rankings anyway.
The search engines view keyword-stuffed anchor text in navigation links
as spammy, and it does not represent a good user experience. Keep using
that usability filter through all of these checks!
Note:
A brief aside about hoarding PageRank: many people have taken
this to an extreme and built sites where they refused to link out to
other quality websites, because they feared losing visitors and link
juice. Ignore this idea! You should link out only to quality
websites. It is good for users, and it is likely to bring ranking
benefits (through building trust and relevance based on what sites
you link to). Just think of the human user and deliver what they would
want. It is remarkable how far this will take you.
1.13. Avoidance of unnecessary subdomains
The engines may not apply the entirety of a domain’s trust and
link juice weight to subdomains. This is largely due to the fact that
a subdomain could be under the control of a different party, and
therefore in the search engine’s eyes it needs to be separately
evaluated. In the great majority of cases, subdomain content can
easily go in a subfolder.
1.14. Geolocation
If your concern is ranking for a local query such as chicago
pizza because you own a pizza parlor in Chicago, make sure
your address is on every page of your site. You should also check your
results in Google Local to see whether you have a problem
there.
1.15. External linking
Check the inbound links to the site. Use a backlinking tool such
as Yahoo! Site Explorer, Linkscape, Majestic-SEO, or Link Diagnosis to collect
data about your links. Look for bad patterns in the anchor text, such
as 87% of the links having the critical keyword for the site in them.
Unless the critical keyword happens to also be the name of the
company, this is a sure sign of trouble. This type of distribution is
quite likely the result of purchasing links or other manipulative
behavior.
On the flip side, make sure the site’s critical keyword is
showing up a fair number of times. A lack of keyword usage in
inbound anchor text is not good either. You need to find a
balance.
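Once a backlinking tool has exported the anchor text of your inbound links, the share that contains the critical keyword is simple arithmetic. The sketch below assumes the export has been reduced to a plain list of anchor text strings; the function name is hypothetical, and it uses a simple case-insensitive substring match.

```python
def keyword_anchor_share(anchor_texts, keyword):
    """Return the fraction of anchor texts that contain keyword."""
    if not anchor_texts:
        return 0.0
    keyword = keyword.lower()
    hits = sum(1 for text in anchor_texts if keyword in text.lower())
    return hits / len(anchor_texts)


anchors = ["blue widgets", "Acme Corp", "Cheap Blue Widgets", "click here"]
share = keyword_anchor_share(anchors, "blue widgets")  # 2 of 4 links match
```

A share close to 1.0 suggests the kind of unnatural pattern described above, while a share close to 0.0 suggests the keyword is underrepresented. Where the healthy middle lies is a judgment call that depends on the site and its competitors.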
Look to see that there are links to pages other than the home
page. These are often called deep links and they
will help drive the ranking of key sections of your site. You should
also look at the links themselves. Visit the linking pages and see
whether the links appear to be paid for. They may be overtly labeled
as sponsored, or their placement may be such that they are clearly not
a natural endorsement. Too many of these are another sure trouble
sign.
Lastly on the topic of external links, make sure that there are
enough links, and that there are also enough high-quality links in the
mix. How does the link profile for the site compare to the link
profiles of its major competitors?
1.16. Page load time
Is the page load time excessive? Too long a load time may slow
down crawling and indexing of the site. However, to be a factor, this
really needs to be excessive—certainly longer than five seconds, and
perhaps even longer than that.
1.17. Image alt tags
Do all the images have relevant, keyword-rich alt attribute
text and filenames? Search engines
can’t easily tell what is inside an image, and the best way to provide
them with some clues is with the alt attribute and the filename of the image.
These can also reinforce the overall context of the page
itself.
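Finding images that lack alt text is easy to automate. The following sketch uses Python's standard `html.parser` and assumes you pass it the HTML of a single page; the names are hypothetical. It flags images whose alt attribute is missing or empty, but it cannot judge whether existing alt text or a filename is actually relevant.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Records the src of every <img> with a missing or empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        if not (attr.get("alt") or "").strip():
            self.missing_alt.append(attr.get("src") or "(no src)")


def images_missing_alt(html):
    checker = AltChecker()
    checker.feed(html)
    checker.close()
    return checker.missing_alt
```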
1.18. Code quality
Although W3C validation is not something the search engines
require, checking the code itself is a good idea. Poor coding can have
some undesirable impacts. As we previously discussed, use a tool such
as SEO Browser to see how the search engines see the page.